Index Combinations and Query Reformulations for Mixed Monolingual Web Retrieval
Abstract
We examine the effectiveness, on the multilingual WebCLEF 2006 test set, of lightweight methods that have proved successful in other web retrieval settings: combinations of document representations on the one hand and query reformulation techniques on the other.

We investigate a range of approaches to crosslingual web retrieval using the test suite of the mixed monolingual CLEF 2006 WebCLEF track, featuring a stream of known-item topics in various languages. The topics are a mixture of manual (human-generated) and automatically generated topics. We examine the robustness of well-known web retrieval techniques on this test set: compact document representations (titles or incoming anchor texts) and query reformulation techniques. In Section 1 we describe our retrieval system as well as the approaches we applied. In Section 2 we describe our experiments, while the results are detailed in Section 3. We conclude in Section 4. For details on the WebCLEF collection and on the topics used we refer to [1].

1 System Description

Our retrieval system is based on the Lucene engine [5]. For ranking, we used the default similarity measure of Lucene, i.e., for a collection D, document d and query q containing terms t_i:
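The ranking formula itself does not appear in the text above. As a rough, non-authoritative sketch, the snippet below assumes the TF-IDF form documented for Lucene's classic default similarity: square-root term frequency, idf = 1 + ln(N / (df + 1)), a coordination factor, and a 1/sqrt(length) document norm. Query normalization and boosts are omitted for brevity. The function name `classic_tfidf_score` and the toy collection are hypothetical, and the code is not claimed to match the authors' exact formula.

```python
import math
from collections import Counter


def classic_tfidf_score(query_terms, doc_terms, collection):
    """Score one document d against query q over collection D.

    Simplified Lucene-style TF-IDF (an assumption, not the paper's formula):
      score = coord(q, d) * sum_t sqrt(tf_{t,d}) * idf_t^2 * length_norm(d)
    """
    n_docs = len(collection)
    doc_counts = Counter(doc_terms)
    unique_query_terms = sorted(set(query_terms))
    matched = [t for t in unique_query_terms if doc_counts[t] > 0]
    if not matched:
        return 0.0

    # Fraction of distinct query terms that occur in the document.
    coord = len(matched) / len(unique_query_terms)
    # Shorter documents receive a higher norm.
    length_norm = 1.0 / math.sqrt(len(doc_terms))

    score = 0.0
    for t in matched:
        df = sum(1 for d in collection if t in d)  # document frequency
        idf = 1.0 + math.log(n_docs / (df + 1))
        tf = math.sqrt(doc_counts[t])
        score += tf * idf * idf * length_norm
    return coord * score


if __name__ == "__main__":
    # Toy collection D: each document is a list of tokens.
    D = [
        ["web", "retrieval", "lucene"],
        ["anchor", "text"],
        ["web", "page", "title"],
    ]
    q = ["lucene", "web"]
    for i, d in enumerate(D):
        print(i, round(classic_tfidf_score(q, d, D), 3))
```

In this toy run the first document matches both query terms and so ranks above the third, which matches only one; the second document matches nothing and scores zero.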
Similar resources
Analysis of users’ query reformulation behavior in Web with regard to Wholistic/analytic cognitive styles, Web experience, and search task type
Background and Aim: The basic aim of the present study is to investigate users’ query reformulation behavior with regard to wholistic-analytic cognitive styles, search task type, and experience variables in using the Web. Method: This study is an applied research using survey method. A total of 321 search queries were submitted by 44 users. Data collection tools were Riding’s Cognitive Style A...
NTCIR-5 Chinese, English, Korean Cross Language Retrieval Experiments using PIRCS
In NTCIR-5 our focus is to see if web-assisted query expansion is useful, and to test an English-Korean bilingual dictionary. We participated in Chinese, Japanese, Korean and English monolingual retrieval using also web expansion for Chinese and English. We also performed Chinese-English, English-Chinese, English-Korean bilingual, and Chinese-Korean pivot bilingual CLIR. The query translation ap...
University of Hagen at GeoCLEF 2008: Combining IR and QA for Geographic Information Retrieval
This paper describes the participation of GIRSA at GeoCLEF 2008, the geographic information retrieval task at CLEF. GIRSA is a modified and improved variant of the system which participated at GeoCLEF 2007. It combines results retrieved with methods from information retrieval (IR) on geographically annotated data and question answering (QA) employing query decomposition. For the monolingual Ger...
Melange: Components for Cross-Lingual Retrieval
We present the finalized version of our cross-lingual search engine Melange, and results obtained by running it on WebCLEF topics in an attempt to solve Mixed Monolingual and Multilingual tasks. We concentrate on certain features of the system which are relevant to the CLIR field and which can be developed further independently. These are our data extraction and indexing methods, our language d...
Syntactic and Semantic Structure in Web Search Queries
Traditionally, information retrieval examines the search query in isolation: a query is used to retrieve documents, and the relevance of the documents returned is evaluated in relation to that query. The query itself is assumed to consist of a bag of words, without any grammatical structure. However, queries can also be shown to exhibit grammatical structure, often consisting of telegraphic nou...
Publication date: 2006